## Cache and Performance

Vo Hieu Nghia

Ex1:

Maximum memory size: 4 GB = 2^32 bytes → 32 address bits

Block size: 64 bytes = 2^6 bytes

* Number of block offset bits: 6
* Number of blocks in memory = 2^(32-6) = 2^26

a)

Number of lines = 128 kB / (1 x 64 B) = 2048 = 2^11

* Number of bits in index: 11
* Number of bits in tag: 32 - 11 - 6 = 15



b)

Number of sets = 128 kB / (2 x 64 B) = 1024 = 2^10

* Number of bits in index: 10
* Number of bits in tag: 32 - 10 - 6 = 16

c)

Number of sets = 128 kB / (8 x 64 B) = 256 = 2^8

* Number of bits in index: 8
* Number of bits in tag: 32 - 8 - 6 = 18
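
The same arithmetic can be checked with a short script. This is a minimal sketch, assuming all sizes are powers of two and that the three cases correspond to associativities 1, 2 and 8; the function name `cache_fields` and its parameters are illustrative, not part of the exercise.

```python
def cache_fields(address_bits, cache_bytes, block_bytes, ways):
    """Return (offset_bits, index_bits, tag_bits) for a set-associative cache."""
    # Sizes are assumed to be powers of two, so bit_length() - 1 equals log2.
    offset_bits = block_bytes.bit_length() - 1
    num_sets = cache_bytes // (ways * block_bytes)
    index_bits = num_sets.bit_length() - 1
    # Whatever is left of the address after index and offset is the tag.
    tag_bits = address_bits - index_bits - offset_bits
    return offset_bits, index_bits, tag_bits


# 32-bit addresses, 128 kB cache, 64-byte blocks, associativity 1, 2 and 8.
for ways in (1, 2, 8):
    print(ways, cache_fields(32, 128 * 1024, 64, ways))
# 1 (6, 11, 15)
# 2 (6, 10, 16)
# 8 (6, 8, 18)
```

Doubling the associativity halves the number of sets, so each step moves one bit from the index to the tag.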

Ex2:

1/ Intel Nehalem:

Miss\_Penalty\_L3 = Main memory latency (DDR3-1600 CAS7) = 107

Miss\_Penalty\_L2 = Hit\_Time\_L3 + Miss\_Rate\_L3 x Miss\_Penalty\_L3

= 39 + 0.004 x 107

= 39.428

Miss\_Penalty\_L1 = Hit\_Time\_L2 + Miss\_Rate\_L2 x Miss\_Penalty\_L2

= 11 + 0.012 x 39.428

= 11.473

AMAT\_Nehalem = Hit\_Time\_L1 + Miss\_Rate\_L1 x Miss\_Penalty\_L1

= 4 + 0.029 x 11.473 ≈ 4.333

2/ Intel Penryn:

Miss\_Penalty\_L2 = main memory access = 160

Miss\_Penalty\_L1 = Hit\_Time\_L2 + Miss\_Rate\_L2 x Miss\_Penalty\_L2

= 15 + 0.004 x 160 = 15.64

AMAT\_Penryn = Hit\_Time\_L1 + Miss\_Rate\_L1 x Miss\_Penalty\_L1

= 3 + 0.029 x 15.64 ≈ 3.454
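
Both calculations follow the same pattern: start from the main memory latency and fold in one cache level at a time, from the outermost level back to L1. The sketch below assumes the hit times, miss rates and memory latencies used above (presumably in clock cycles); the function name `amat` and the list layout are illustrative choices.

```python
def amat(levels, memory_latency):
    """Average memory access time for a cache hierarchy.

    levels: list of (hit_time, miss_rate) pairs, ordered from L1 outward.
    memory_latency: miss penalty of the last cache level.
    """
    penalty = memory_latency
    # Work inward: each level's cost is its hit time plus the fraction of
    # accesses that miss, times the penalty of the next level out.
    for hit_time, miss_rate in reversed(levels):
        penalty = hit_time + miss_rate * penalty
    return penalty


nehalem = amat([(4, 0.029), (11, 0.012), (39, 0.004)], 107)
penryn = amat([(3, 0.029), (15, 0.004)], 160)
print(round(nehalem, 3), round(penryn, 3))  # 4.333 3.454
```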

Ex3:

Each entry is the block's age counter after the reference: 0 marks the most recently used block, and the highest value marks the least recently used one.

| **Block number in set** | **Referenced block** | | | | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | **3** | **2** | **1** | **0** | **2** | **0** | **3** | **1** |
| **0** |  |  |  | 0 | 1 | 0 | 1 | 2 |
| **1** |  |  | 0 | 1 | 2 | 2 | 3 | 0 |
| **2** |  | 0 | 1 | 2 | 0 | 1 | 2 | 3 |
| **3** | 0 | 1 | 2 | 3 | 3 | 3 | 0 | 1 |

1. LRU: block 2 (least recently used after the last reference, age 3)
2. FIFO: block 3 (the first block loaded)
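
Both answers can be reproduced with a small simulation. This is a sketch under two assumptions: the set holds 4 blocks, and the question is which block would be evicted on the next miss after the reference sequence 3, 2, 1, 0, 2, 0, 3, 1. The function names are illustrative.

```python
from collections import OrderedDict, deque


def lru_victim(references, capacity):
    """Return the block LRU would evict next, after replaying the references."""
    recency = OrderedDict()  # keys ordered from least to most recently used
    for block in references:
        if block in recency:
            recency.move_to_end(block)       # hit: mark as most recently used
        else:
            if len(recency) == capacity:
                recency.popitem(last=False)  # miss in a full set: evict LRU
            recency[block] = True
    return next(iter(recency))


def fifo_victim(references, capacity):
    """Return the block FIFO would evict next, after replaying the references."""
    queue = deque()  # blocks in load order, oldest first
    for block in references:
        if block not in queue:               # hits do not change FIFO order
            if len(queue) == capacity:
                queue.popleft()
            queue.append(block)
    return queue[0]


refs = [3, 2, 1, 0, 2, 0, 3, 1]
print(lru_victim(refs, 4), fifo_victim(refs, 4))  # 2 3
```

The two policies disagree because FIFO ignores hits: block 3 was loaded first but was re-referenced late in the sequence, so LRU keeps it.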

Ex4:

Address: 0x8F3ED5 on a 32-bit address bus:

0000 0000 1000 1111 0011 1110 1101 0101

Shift it right by 6 bits (the block offset) to get the block number:

0000 0000 1000 1111 0011 1110 11

Regrouped into 4-bit groups and padded to 32 bits: 0000 0000 0000 0010 0011 1100 1111 1011 = 0x23CFB = 146683

Number of bits in tag: 18 → take the 18 leftmost bits of the address:

0000 0000 0000 0000 0000 0010 0011 1100 = 0x23C = 572
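
The shifts can be verified in a few lines. This sketch assumes the 8-way configuration from Ex1 (6 offset bits, 8 index bits, 18 tag bits); the set index is not asked for in the steps above and is shown only for completeness. The name `split_address` is illustrative.

```python
def split_address(address, offset_bits, index_bits):
    """Split an address into (block_number, index, tag)."""
    block_number = address >> offset_bits            # drop the block offset
    index = block_number & ((1 << index_bits) - 1)   # low bits select the set
    tag = block_number >> index_bits                 # remaining high bits
    return block_number, index, tag


block_number, index, tag = split_address(0x8F3ED5, offset_bits=6, index_bits=8)
print(block_number, hex(block_number), index, tag, hex(tag))
# 146683 0x23cfb 251 572 0x23c
```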